80 research outputs found

    A Unified Encoder-Decoder Framework with Entity Memory

    Entities, as important carriers of real-world knowledge, play a key role in many NLP tasks. We focus on incorporating entity knowledge into an encoder-decoder framework for informative text generation. Existing approaches tried to index, retrieve, and read external documents as evidence, but they suffered from a large computational overhead. In this work, we propose an encoder-decoder framework with an entity memory, namely EDMem. The entity knowledge is stored in the memory as latent representations, and the memory is pre-trained on Wikipedia along with encoder-decoder parameters. To precisely generate entity names, we design three decoding methods to constrain entity generation by linking entities in the memory. EDMem is a unified framework that can be used on various entity-intensive question answering and generation tasks. Extensive experimental results show that EDMem outperforms both memory-based auto-encoder models and non-memory encoder-decoder models. Comment: Accepted by the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022).

    Spatio-Temporal Kronecker Compressive Sensing for Traffic Matrix Recovery

    A traffic matrix is generally used by several network management tasks in a data center network, such as traffic engineering and anomaly detection. It gives a flow-level view of the network traffic volume. Despite the explicit importance of the traffic matrix, it is significantly difficult to implement a large-scale measurement to build an absolute traffic matrix. Generally, the traffic matrix obtained by the operators is imperfect, i.e., some traffic data may be lost. Hence, we focus on the problems of recovering these missing traffic data in this paper. To recover these missing traffic data, we propose the spatio-temporal Kronecker compressive sensing method, which draws on Kronecker compressive sensing. In our method, we account for the spatial and temporal properties of the traffic matrix to construct a sparsifying basis that can sparsely represent the traffic matrix. Simultaneously, we consider the low-rank property of the traffic matrix and propose a novel recovery model. We finally assess the estimation error of the proposed method by recovering real traffic
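The Kronecker construction mentioned in this abstract rests on a standard identity: if a matrix X is written as X = A S Bᵀ with sparse coefficients S, then its column-stacked vector satisfies vec(X) = (B ⊗ A) vec(S), so B ⊗ A acts as a sparsifying basis for vec(X). The sketch below only checks that identity in pure Python. The small `spatial` and `temporal` matrices and the coefficient matrix are hypothetical placeholders, not the bases or data used in the paper, and the low-rank recovery model is not shown.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def vec(m):
    """Stack the columns of a matrix into one flat list."""
    return [v for col in zip(*m) for v in col]

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

# Hypothetical bases (assumptions for illustration only).
spatial = [[1, 1], [1, -1]]                      # 2 x 2
temporal = [[1, 1, 0], [1, -1, 0], [0, 0, 1]]    # 3 x 3

# Sparse coefficient matrix: only two non-zero entries.
coeffs = [[5, 0, 0], [0, 0, 2]]                  # 2 x 3

# Matrix form: X = spatial * coeffs * temporal^T.
traffic = matmul(matmul(spatial, coeffs), transpose(temporal))

# Vector form: vec(X) = (temporal kron spatial) * vec(coeffs).
basis = kron(temporal, spatial)
column = [[v] for v in vec(coeffs)]
vec_from_kron = [row[0] for row in matmul(basis, column)]

print(vec(traffic) == vec_from_kron)
```

Both forms describe the same matrix, which is why a recovery method can work with a single Kronecker basis instead of handling the spatial and temporal dimensions separately.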

    A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods

    Multi-task learning (MTL) has become increasingly popular in natural language processing (NLP) because it improves the performance of related tasks by exploiting their commonalities and differences. Nevertheless, it is still not understood very well how multi-task learning can be implemented based on the relatedness of training tasks. In this survey, we review recent advances of multi-task learning methods in NLP, with the aim of summarizing them into two general multi-task training methods based on their task relatedness: (i) joint training and (ii) multi-step training. We present examples in various NLP downstream applications, summarize the task relationships and discuss future directions of this promising topic. Comment: Accepted to EACL 2023 as regular long paper.

    Quality assessment for virtual reality technology based on real scene

    Virtual reality technology is a new display technology that provides users with a realistic viewing experience. Most virtual reality content is displayed through stereoscopic images. However, image quality is affected by the collection, storage, and transmission processes. If the stereoscopic image quality in a virtual reality system is seriously degraded, the user will feel uncomfortable, and this can even cause health problems. In this paper, we establish a set of accurate and effective evaluation methods for virtual reality. In preprocessing, we segment the original reference image and the distorted image into binocular regions and monocular regions. Then, the Information-weighted SSIM (IW-SSIM) or Information-weighted PSNR (IW-PSNR) values over the monocular regions are used to obtain the IW-score. At the same time, the Stereo-weighted-SSIM (SW-SSIM) or Stereo-weighted-PSNR (SW-PSNR) can be used to calculate the SW-score. Finally, we pool the stereoscopic image score by combining the IW-score and SW-score. Experiments show that our method is highly consistent with human subjective judgment in the evaluation of virtual reality technology.
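As a rough illustration of the final pooling step, the sketch below computes plain PSNR (the standard definition, 10·log10(MAX² / MSE)) and then combines two region scores. The abstract does not state how the IW-score and SW-score are combined, so the convex combination in `pool_scores` and its `weight` parameter are assumptions, and the information and stereo weighting themselves are not implemented here.

```python
import math

def psnr(reference, distorted, max_value=255.0):
    """Plain (unweighted) PSNR in dB between two equal-length pixel lists."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)

def pool_scores(iw_score, sw_score, weight=0.5):
    """Hypothetical pooling rule: convex combination of the two scores.

    The weight is an assumption; the source does not give the actual rule.
    """
    return weight * iw_score + (1.0 - weight) * sw_score

# Hypothetical pixel values for a monocular and a binocular region.
monocular_score = psnr([100, 120, 140], [101, 118, 143])
binocular_score = psnr([50, 60, 70], [50, 61, 70])
print(pool_scores(monocular_score, binocular_score))
```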

    Internal Cross-layer Gradients for Extending Homogeneity to Heterogeneity in Federated Learning

    Federated learning (FL) inevitably confronts the challenge of system heterogeneity in practical scenarios. To enhance the capabilities of most model-homogeneous FL methods in handling system heterogeneity, we propose a training scheme that can extend their capabilities to cope with this challenge. In this paper, we commence our study with a detailed exploration of homogeneous and heterogeneous FL settings and discover three key observations: (1) a positive correlation between client performance and layer similarities, (2) higher similarities in the shallow layers in contrast to the deep layers, and (3) smoother gradient distributions indicate higher layer similarities. Building upon these observations, we propose InCo Aggregation, which leverages internal cross-layer gradients, a mixture of gradients from shallow and deep layers within a server model, to augment the similarity in the deep layers without requiring additional communication between clients. Furthermore, our methods can be tailored to accommodate model-homogeneous FL methods such as FedAvg, FedProx, FedNova, Scaffold, and MOON, to expand their capabilities to handle system heterogeneity. Copious experimental results validate the effectiveness of InCo Aggregation, spotlighting internal cross-layer gradients as a promising avenue to enhance the performance in heterogeneous FL. Comment: Preprint. Under review.
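For context, the model-homogeneous baseline that this abstract builds on can be sketched as FedAvg-style aggregation: the server averages client parameters, weighted by each client's number of training samples. This is not InCo Aggregation, whose cross-layer gradient mixing is not specified in the abstract. The flat parameter lists and client sizes below are hypothetical.

```python
def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors.

    All clients must share the same model shape, which is the
    model-homogeneous assumption.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two hypothetical clients with two parameters each.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)  # [2.5, 3.5]
```

The averaging only works when every client returns a vector of the same length, which is the restriction that heterogeneous settings break.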

    3D radiomics predicts EGFR mutation, exon-19 deletion and exon-21 L858R mutation in lung adenocarcinoma

    Background: To establish a radiomic approach to identify epidermal growth factor receptor (EGFR) mutation status in lung adenocarcinoma patients based on CT images, and to distinguish exon-19 deletion and exon-21 L858R mutation. Methods: Two hundred sixty-three patients who underwent pre-surgical contrast-enhanced CT and molecular testing were included and randomly divided into training (80%) and test (20%) cohorts. Tumor images were three-dimensionally segmented to extract 1,672 radiomic features. Clinical features (age, gender, and smoking history) were added to build classification models together with radiomic features. Subsequently, the top-10 most relevant features were used to establish classifiers. For the classification tasks, namely EGFR mutation, exon-19 deletion, and exon-21 L858R mutation, four logistic regression models were established for each task. Results: The training and test cohorts consisted of 210 and 53 patients, respectively. Among the four models, the highest accuracy and sensitivity for classifying EGFR mutation were 75.5% (61.7-86.2%) and 92.9% (76.5-99.1%), respectively. The highest specificity values were 86.7% (69.3-96.2%) and 70.4% (49.8-86.3%) for classifying exon-19 deletion and exon-21 L858R mutation, respectively. Conclusions: CT radiomics can sensitively identify the presence of EGFR mutation, and increase the certainty of distinguishing exon-19 deletion and exon-21 L858R mutation in lung adenocarcinoma patients. CT radiomics may become a helpful non-invasive biomarker to select EGFR mutation patients for invasive sampling.
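The cohort sizes in this abstract follow from the stated 80/20 split of 263 patients, and the reported metrics are the usual confusion-matrix ratios. The sketch below reproduces the split arithmetic and defines the three metrics. The confusion-matrix counts are hypothetical values chosen for illustration, not the study's data, and rounding the training size to the nearest integer is an assumption about how the split was made.

```python
def split_sizes(n_total, train_fraction):
    """Return (train, test) sizes, rounding the training size to the nearest integer."""
    n_train = round(n_total * train_fraction)
    return n_train, n_total - n_train

def accuracy(tp, fp, tn, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def sensitivity(tp, fn):
    """Fraction of true positives among all actual positives."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)

sizes = split_sizes(263, 0.8)
print(sizes)  # (210, 53)

# Hypothetical confusion-matrix counts (not taken from the study).
tp, fp, tn, fn = 8, 4, 6, 2
print(accuracy(tp, fp, tn, fn), sensitivity(tp, fn), specificity(tn, fp))
```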